CORONAVIRUS ANALYSIS IN INDIA

Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus.

  • Most people infected with the COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment.
  • Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness.

  • The COVID-19 virus spreads primarily through droplets of saliva or discharge from the nose when an infected person coughs or sneezes, so it’s important that you also practice respiratory etiquette (for example, by coughing into a flexed elbow).

  • The best way to prevent and slow down transmission is to be well informed about the COVID-19 virus, the disease it causes, and how it spreads.

  • Protect yourself and others from infection by washing your hands or using an alcohol-based rub frequently, and by not touching your face.

To prevent infection and to slow transmission of COVID-19, do the following:

  • Wash your hands regularly with soap and water, or clean them with an alcohol-based hand rub.
  • Maintain at least 1.5 metres of distance between yourself and people who are coughing or sneezing.
  • Avoid touching your face.
  • Cover your mouth and nose when coughing or sneezing.
  • Stay home if you feel unwell.
  • Refrain from smoking and other activities that weaken the lungs.
  • Practice physical distancing by avoiding unnecessary travel and staying away from large groups of people.

Importing Libraries

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import sys
import matplotlib
import matplotlib.pyplot as plt
import plotly.offline as plo
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import bar_chart_race as bcr
from IPython.display import HTML
import warnings
warnings.filterwarnings("ignore")
from IPython.display import Video

Exploratory Data Analysis and Data Visualization

Importing and Reading the Dataset

covid_19_india.csv

In [2]:
data = pd.read_csv(r"D:\DATA SCIENCE\Covid19 analysis\557629_1323860_bundle_archive\covid_19_india.csv")

Displaying the Dataset

In [3]:
data.head(10)
Out[3]:
Sno Date Time State/UnionTerritory ConfirmedIndianNational ConfirmedForeignNational Cured Deaths Confirmed
0 1 30/01/20 6:00 PM Kerala 1 0 0 0 1
1 2 31/01/20 6:00 PM Kerala 1 0 0 0 1
2 3 01/02/20 6:00 PM Kerala 2 0 0 0 2
3 4 02/02/20 6:00 PM Kerala 3 0 0 0 3
4 5 03/02/20 6:00 PM Kerala 3 0 0 0 3
5 6 04/02/20 6:00 PM Kerala 3 0 0 0 3
6 7 05/02/20 6:00 PM Kerala 3 0 0 0 3
7 8 06/02/20 6:00 PM Kerala 3 0 0 0 3
8 9 07/02/20 6:00 PM Kerala 3 0 0 0 3
9 10 08/02/20 6:00 PM Kerala 3 0 0 0 3

Displaying the Dataset Columns

In [4]:
data.columns
Out[4]:
Index(['Sno', 'Date', 'Time', 'State/UnionTerritory',
       'ConfirmedIndianNational', 'ConfirmedForeignNational', 'Cured',
       'Deaths', 'Confirmed'],
      dtype='object')

Displaying the Dataset Information

In [5]:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3963 entries, 0 to 3962
Data columns (total 9 columns):
 #   Column                    Non-Null Count  Dtype 
---  ------                    --------------  ----- 
 0   Sno                       3963 non-null   int64 
 1   Date                      3963 non-null   object
 2   Time                      3963 non-null   object
 3   State/UnionTerritory      3963 non-null   object
 4   ConfirmedIndianNational   3963 non-null   object
 5   ConfirmedForeignNational  3963 non-null   object
 6   Cured                     3963 non-null   int64 
 7   Deaths                    3963 non-null   int64 
 8   Confirmed                 3963 non-null   int64 
dtypes: int64(4), object(5)
memory usage: 201.3+ KB
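
`ConfirmedIndianNational` and `ConfirmedForeignNational` are reported as `object` rather than `int64`, which usually means that some entries are not plain numbers. The following is a minimal sketch of one way to convert such a column, assuming the non-numeric entries are placeholders such as `-`. The sample values are made up for illustration and are not taken from the file.

```python
import pandas as pd

# Hypothetical sample that mimics an object-typed count column
sample = pd.DataFrame({"ConfirmedIndianNational": ["1", "2", "-", "3"]})

# errors="coerce" turns anything that cannot be parsed into NaN
sample["ConfirmedIndianNational"] = pd.to_numeric(
    sample["ConfirmedIndianNational"], errors="coerce"
)

n_missing = int(sample["ConfirmedIndianNational"].isna().sum())
print(sample.dtypes)
print("Unparseable entries:", n_missing)
```

Counting the resulting NaN values shows how many rows would be lost before the column is used in a numeric plot.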

Displaying the Dataset Description

In [6]:
data.describe()
Out[6]:
Sno Cured Deaths Confirmed
count 3963.000000 3963.000000 3963.000000 3963.000000
mean 1982.000000 2791.492556 155.840020 5245.161746
std 1144.163887 9779.420988 685.676637 17698.383165
min 1.000000 0.000000 0.000000 0.000000
25% 991.500000 3.000000 0.000000 28.000000
50% 1982.000000 60.000000 3.000000 301.000000
75% 2972.500000 1224.500000 34.000000 2824.000000
max 3963.000000 127259.000000 9667.000000 230599.000000

Checking for Null Values in the Dataset

In [7]:
data.isnull().sum()
Out[7]:
Sno                         0
Date                        0
Time                        0
State/UnionTerritory        0
ConfirmedIndianNational     0
ConfirmedForeignNational    0
Cured                       0
Deaths                      0
Confirmed                   0
dtype: int64

Changing the Date Format

In [8]:
data['Date'] = pd.to_datetime(data['Date'], dayfirst=True)
data.head()
Out[8]:
Sno Date Time State/UnionTerritory ConfirmedIndianNational ConfirmedForeignNational Cured Deaths Confirmed
0 1 2020-01-30 6:00 PM Kerala 1 0 0 0 1
1 2 2020-01-31 6:00 PM Kerala 1 0 0 0 1
2 3 2020-02-01 6:00 PM Kerala 2 0 0 0 2
3 4 2020-02-02 6:00 PM Kerala 3 0 0 0 3
4 5 2020-02-03 6:00 PM Kerala 3 0 0 0 3
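
The raw dates are written day first (for example `30/01/20`), so `dayfirst=True` matters for ambiguous values such as `01/02/20`. A small sketch of the difference it makes, using two date strings from the output above; the explicit `format` string is an assumption that matches those two samples.

```python
import pandas as pd

raw = pd.Series(["30/01/20", "01/02/20"])

# dayfirst=True reads 01/02/20 as 1 February rather than 2 January
parsed = pd.to_datetime(raw, dayfirst=True)

# Spelling out the format removes any guessing by the parser
explicit = pd.to_datetime(raw, format="%d/%m/%y")

print(parsed.dt.strftime("%Y-%m-%d").tolist())
```

An explicit format also fails loudly if a row does not follow the expected pattern, which can be useful when the file is updated.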

Sorting States/Union Territories by Confirmed Cases in Descending Order

In [9]:
# The counts are cumulative, so the per-state maximum is the latest reported total
data3 = pd.pivot_table(data, values=['Confirmed','Deaths','Cured'], index='State/UnionTerritory', aggfunc='max')
data3 = data3.sort_values(by='Confirmed', ascending=False)
data3.style.background_gradient(cmap='Wistia')
Out[9]:
Confirmed Cured Deaths
State/UnionTerritory
Maharashtra 230599 127259 9667
Tamil Nadu 126581 78161 1765
Delhi 107051 82226 3258
Gujarat 39194 27718 2008
Uttar Pradesh 32362 21127 862
Karnataka 31105 12833 486
Telangana 30946 18192 331
West Bengal 25911 16826 854
Andhra Pradesh 23814 12154 277
Rajasthan 22563 17070 491
Haryana 19369 14510 287
Madhya Pradesh 16341 12232 634
Assam 14032 8729 22
Bihar 13944 9816 115
Odisha 11201 7407 52
Jammu and Kashmir 9501 5695 154
Cases being reassigned to states 9265 0 0
Punjab 7140 4945 183
Kerala 6534 3708 27
Telengana 4111 1817 156
Chhattisgarh 3675 2903 15
Uttarakhand 3305 2672 46
Jharkhand 3246 2208 23
Goa 2151 1273 9
Tripura 1776 1338 1
Manipur 1450 799 0
Puducherry 1151 584 14
Himachal Pradesh 1140 846 11
Ladakh 1055 915 1
Nagaland 673 304 0
Chandigarh 523 403 7
Dadra and Nagar Haveli and Daman and Diu 411 189 0
Arunachal Pradesh 302 120 2
Mizoram 197 133 0
Andaman and Nicobar Islands 151 83 0
Sikkim 134 72 0
Meghalaya 113 66 2
Unassigned 77 0 0
Dadar Nagar Haveli 26 2 0
Daman & Diu 2 0 0

Total Cases in India

In [10]:
print("Total Cases in India:",int(data3['Confirmed'].sum()))
Total Cases in India: 803122
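
The table above lists `Telangana` and `Telengana` separately and also contains the placeholder rows `Cases being reassigned to states` and `Unassigned`, so summing the per-state maxima may count some cases twice. The following is a minimal sketch of cleaning the labels before aggregating. It uses a few values from the table above arranged in a made-up frame; the `fixes` mapping and the `placeholders` list are illustrative assumptions, and whether the two Telangana spellings overlap in time should be checked against the raw data.

```python
import pandas as pd

# A few values from the table above, arranged in a made-up frame
rows = pd.DataFrame({
    "State/UnionTerritory": ["Telangana", "Telengana", "Kerala", "Unassigned"],
    "Confirmed": [30946, 4111, 6534, 77],
})

fixes = {"Telengana": "Telangana"}  # hypothetical spelling corrections
placeholders = ["Unassigned", "Cases being reassigned to states"]

# Drop the placeholder rows, then map misspelled names to one label
cleaned = rows[~rows["State/UnionTerritory"].isin(placeholders)].copy()
cleaned["State/UnionTerritory"] = cleaned["State/UnionTerritory"].replace(fixes)

# Latest cumulative total per cleaned label
latest = cleaned.groupby("State/UnionTerritory")["Confirmed"].max()
total = int(latest.sum())
print(latest)
print("Total:", total)
```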

Representation using Bar Chart Plotting

In [11]:
# One bar trace per column (Confirmed, Cured, Deaths), grouped by state
data4 = [
    go.Bar(x=data3.index, y=data3[colname], name=colname)
    for colname in data3.columns
]

fig = go.Figure(data=data4)

plo.iplot(fig)

Confirmed Cases in the Top Five States - Graphical Representation

In [12]:
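def plotly_graph_state(state):
    """Plot the confirmed-case curve for one state or union territory."""
    temp = data[data['State/UnionTerritory']==state]
    trace0 = go.Scatter(
        x = temp['Date'],  # plot against the date rather than the row index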
        y = temp['Confirmed'],
        mode = 'lines+markers',
        marker = dict(color='green'),
        name = 'Confirmed Cases in {0}'.format(state)
    )
    layout = go.Layout(
        title = 'Confirmed Cases in {0}'.format(state)
    )
    fig = go.Figure(
        data = [trace0],
        layout = layout
    )
    plo.iplot(fig)
In [13]:
all_states = list(data3.sort_values(by='Confirmed', ascending= False).index[:5])
for state in all_states:
    plotly_graph_state(state)

Scatter Plots Using Dots for Various Factors

In [14]:
sns.relplot(x="Cured", y="Confirmed", hue = "State/UnionTerritory", data=data)
Out[14]:
<seaborn.axisgrid.FacetGrid at 0x532ff28>

The cured count rises along with the confirmed count, which suggests that people are recovering quickly.

In [15]:
sns.relplot(x="Confirmed", y="Deaths", hue = "State/UnionTerritory", data=data)
Out[15]:
<seaborn.axisgrid.FacetGrid at 0x12c6a520>

The number of deaths stays low relative to the number of confirmed cases in India.
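
To put a number on this observation, deaths and recoveries can be expressed per 100 confirmed cases. The sketch below uses the latest totals of the three most affected states, copied from Out[9]; the column names `DeathsPer100` and `CuredPer100` are introduced here for illustration only. These are shares of the cases confirmed so far and not final fatality rates, because many cases were still active at the time.

```python
import pandas as pd

# Latest totals for the three most affected states, copied from Out[9]
top = pd.DataFrame(
    {
        "Confirmed": [230599, 126581, 107051],
        "Cured": [127259, 78161, 82226],
        "Deaths": [9667, 1765, 3258],
    },
    index=["Maharashtra", "Tamil Nadu", "Delhi"],
)

# Deaths and recoveries per 100 confirmed cases
top["DeathsPer100"] = (top["Deaths"] / top["Confirmed"] * 100).round(2)
top["CuredPer100"] = (top["Cured"] / top["Confirmed"] * 100).round(2)

print(top[["DeathsPer100", "CuredPer100"]])
```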

In [16]:
sns.relplot(x="ConfirmedIndianNational", y="Confirmed", hue = "State/UnionTerritory", data=data)
Out[16]:
<seaborn.axisgrid.FacetGrid at 0x12ae4370>

Initially, most cases were detected in foreign nationals, but as the days went by, more and more cases were found among Indian nationals.

In [17]:
sns.pairplot(data)
Out[17]:
<seaborn.axisgrid.PairGrid at 0x12a20280>

The pair plot is harder to read than the individual scatter plots, but it gives an overall view of every pair of numeric columns in the dataset.

Line Plots for Various Factors

In [18]:
sns.relplot(x='Cured',y='Confirmed', kind='line', hue='State/UnionTerritory',  data=data)
Out[18]:
<seaborn.axisgrid.FacetGrid at 0x15b81b98>

Importing and Reading the Dataset

AgeGroupDetails.csv

In [19]:
data1 = pd.read_csv(r"D:\DATA SCIENCE\Covid19 analysis\557629_1323860_bundle_archive\AgeGroupDetails.csv")
In [20]:
data1.head(10)
Out[20]:
Sno AgeGroup TotalCases Percentage
0 1 0 to 9 22 3.18%
1 2 10 to 19 27 3.90%
2 3 20 to 29 172 24.86%
3 4 30 to 39 146 21.10%
4 5 40 to 49 112 16.18%
5 6 50 to 59 77 11.13%
6 7 60 to 69 89 12.86%
7 8 70 to 79 28 4.05%
8 9 >=80 10 1.45%
9 10 Missing 9 1.30%
In [21]:
data1.columns
Out[21]:
Index(['Sno', 'AgeGroup', 'TotalCases', 'Percentage'], dtype='object')
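
The `Percentage` column is stored as text such as `3.18%`, so it cannot be used in arithmetic directly. A minimal sketch, using the ten rows shown in Out[20], of parsing it and cross-checking it against `TotalCases`; the names `PercentValue` and `Recomputed` are introduced here for illustration.

```python
import pandas as pd

# The ten rows shown in Out[20]
age = pd.DataFrame({
    "AgeGroup": ["0 to 9", "10 to 19", "20 to 29", "30 to 39", "40 to 49",
                 "50 to 59", "60 to 69", "70 to 79", ">=80", "Missing"],
    "TotalCases": [22, 27, 172, 146, 112, 77, 89, 28, 10, 9],
    "Percentage": ["3.18%", "3.90%", "24.86%", "21.10%", "16.18%",
                   "11.13%", "12.86%", "4.05%", "1.45%", "1.30%"],
})

# Strip the percent sign and convert the remainder to a float
age["PercentValue"] = age["Percentage"].str.rstrip("%").astype(float)

# Recompute each share from the raw counts and compare
total_cases = int(age["TotalCases"].sum())
age["Recomputed"] = age["TotalCases"] / total_cases * 100
max_diff = float((age["PercentValue"] - age["Recomputed"]).abs().max())

print(total_cases, round(max_diff, 4))
```

A small maximum difference indicates that the published percentages are simply rounded shares of `TotalCases`, which is also what the pie chart recomputes through `autopct`.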

Representation of the dataset using Pie Chart

In [22]:
import matplotlib.ticker as ticker
import matplotlib.cm as cm
import matplotlib as mpl
from matplotlib.gridspec import GridSpec
import matplotlib.pyplot as plt
%matplotlib inline
In [23]:
labels = list(data1['AgeGroup'])
sizes = list(data1['TotalCases'])
explode = [0.05] * len(labels)  # pull every wedge out slightly
plt.figure(figsize= (15,10))
plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=9, explode =explode)
centre_circle = plt.Circle((0,0),0.70,fc='white')
fig = plt.gcf()
fig.gca().add_artist(centre_circle)
plt.title('India - Age Group wise Distribution',fontsize = 20)
plt.axis('equal')  
plt.tight_layout()

Importing and Reading the Dataset

HospitalBedsIndia.csv

In [24]:
data2 = pd.read_csv(r"D:\DATA SCIENCE\Covid19 analysis\557629_1323860_bundle_archive\HospitalBedsIndia.csv")
In [25]:
data2.head(10)
Out[25]:
Sno State/UT NumPrimaryHealthCenters_HMIS NumCommunityHealthCenters_HMIS NumSubDistrictHospitals_HMIS NumDistrictHospitals_HMIS TotalPublicHealthFacilities_HMIS NumPublicBeds_HMIS NumRuralHospitals_NHP18 NumRuralBeds_NHP18 NumUrbanHospitals_NHP18 NumUrbanBeds_NHP18
0 1 Andaman & Nicobar Islands 27 4 NaN 3 34 1246 27 575 3 500
1 2 Andhra Pradesh 1417 198 31.0 20 1666 60799 193 6480 65 16658
2 3 Arunachal Pradesh 122 62 NaN 15 199 2320 208 2136 10 268
3 4 Assam 1007 166 14.0 33 1220 19115 1176 10944 50 6198
4 5 Bihar 2007 63 33.0 43 2146 17796 930 6083 103 5936
5 6 Chandigarh 40 2 1.0 4 47 3756 0 0 4 778
6 7 Chhattisgarh 813 166 12.0 32 1023 14354 169 5070 45 4342
7 8 Dadra & Nagar Haveli 9 2 1.0 1 13 568 10 273 1 316
8 9 Daman & Diu 4 2 NaN 2 8 298 5 240 0 0
9 10 Delhi 534 25 9.0 47 615 20572 0 0 109 24383
In [26]:
data2.columns
Out[26]:
Index(['Sno', 'State/UT', 'NumPrimaryHealthCenters_HMIS',
       'NumCommunityHealthCenters_HMIS', 'NumSubDistrictHospitals_HMIS',
       'NumDistrictHospitals_HMIS', 'TotalPublicHealthFacilities_HMIS',
       'NumPublicBeds_HMIS', 'NumRuralHospitals_NHP18', 'NumRuralBeds_NHP18',
       'NumUrbanHospitals_NHP18', 'NumUrbanBeds_NHP18'],
      dtype='object')

Analysing the various factors present in the Dataset using Visual / Graphical Representation

In [27]:
data4 = pd.pivot_table(data2, values=['TotalPublicHealthFacilities_HMIS','NumPrimaryHealthCenters_HMIS','NumCommunityHealthCenters_HMIS','NumSubDistrictHospitals_HMIS','NumDistrictHospitals_HMIS'], index='State/UT', aggfunc='max')
data4 = data4.sort_values(by='TotalPublicHealthFacilities_HMIS',ascending= False)
data4.style.background_gradient(cmap='Wistia')
Out[27]:
NumCommunityHealthCenters_HMIS NumDistrictHospitals_HMIS NumPrimaryHealthCenters_HMIS NumSubDistrictHospitals_HMIS TotalPublicHealthFacilities_HMIS
State/UT
All India 5568 1003 29,899 1255.000000 37725
Uttar Pradesh 671 174 3277 nan 4122
Maharashtra 430 70 2638 101.000000 3239
Rajasthan 579 33 2463 64.000000 3139
Karnataka 207 42 2547 147.000000 2943
Tamil Nadu 385 32 1854 310.000000 2581
Gujarat 385 37 1770 44.000000 2236
Bihar 63 43 2007 33.000000 2146
West Bengal 406 55 1374 70.000000 1905
Madhya Pradesh 324 51 1420 72.000000 1867
Odisha 377 35 1360 27.000000 1799
Andhra Pradesh 198 20 1417 31.000000 1666
Kerala 229 53 933 82.000000 1297
Assam 166 33 1007 14.000000 1220
Chhattisgarh 166 32 813 12.000000 1023
Telangana 82 15 788 47.000000 932
Jammu & Kashmir 87 29 702 nan 818
Punjab 146 28 521 47.000000 742
Haryana 131 28 500 24.000000 683
Himachal Pradesh 79 15 516 61.000000 671
Delhi 25 47 534 9.000000 615
Jharkhand 179 23 343 13.000000 558
Uttarakhand 69 20 275 19.000000 383
Arunachal Pradesh 62 15 122 nan 199
Meghalaya 29 13 138 nan 180
Nagaland 21 11 134 nan 166
Tripura 22 9 114 12.000000 157
Manipur 17 9 87 1.000000 114
Mizoram 10 9 65 3.000000 87
Puducherry 4 4 40 5.000000 53
Chandigarh 2 4 40 1.000000 47
Goa 4 3 31 2.000000 40
Andaman & Nicobar Islands 4 3 27 nan 34
Sikkim 2 4 25 1.000000 32
Dadra & Nagar Haveli 2 1 9 1.000000 13
Lakshadweep 3 1 4 2.000000 10
Daman & Diu 2 2 4 nan 8
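
Two details in the table above deserve care before further analysis: the `All India` row is an aggregate of the other rows, and its `NumPrimaryHealthCenters_HMIS` value is printed as `29,899`, which suggests that the column is stored as text. The following is a minimal sketch of handling both, using three rows from the table; treating the column as text is an assumption based on that printed value.

```python
import pandas as pd

# Three rows from the table above, with the count column kept as text
facilities = pd.DataFrame({
    "State/UT": ["All India", "Uttar Pradesh", "Maharashtra"],
    "NumPrimaryHealthCenters_HMIS": ["29,899", "3277", "2638"],
})

# Drop the aggregate row so that it is not counted alongside the states
states_only = facilities[facilities["State/UT"] != "All India"].copy()

# Remove thousands separators, then convert the text to integers
states_only["NumPrimaryHealthCenters_HMIS"] = (
    states_only["NumPrimaryHealthCenters_HMIS"]
    .str.replace(",", "", regex=False)
    .astype(int)
)

phc_total = int(states_only["NumPrimaryHealthCenters_HMIS"].sum())
print(states_only)
print("Primary health centres in the listed states:", phc_total)
```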
In [28]:
# Exclude the "All India" aggregate row so that it does not take up half of the pie
states = data2[data2['State/UT'] != 'All India']
labels = list(states['State/UT'])
sizes = list(states['TotalPublicHealthFacilities_HMIS'])
explode = [0.05] * len(labels)
plt.figure(figsize= (15,10))
plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=9, explode =explode)
centre_circle = plt.Circle((0,0),0.70,fc='white')
fig = plt.gcf()
fig.gca().add_artist(centre_circle)
plt.title('India - Public Health Facility wise Distribution',fontsize = 20)
plt.axis('equal')  
plt.tight_layout()
In [29]:
sns.pairplot(data2)
Out[29]:
<seaborn.axisgrid.PairGrid at 0x15e89ca0>

Importing and Reading the Dataset

StatewiseTestingDetails.csv

In [30]:
data5 = pd.read_csv(r"D:\DATA SCIENCE\Covid19 analysis\557629_1323860_bundle_archive\StatewiseTestingDetails.csv")
data5.head()
Out[30]:
Date State TotalSamples Negative Positive
0 17-04-2020 Andaman and Nicobar Islands 1403 1210 12.0
1 24-04-2020 Andaman and Nicobar Islands 2679 NaN 27.0
2 27-04-2020 Andaman and Nicobar Islands 2848 NaN 33.0
3 01-05-2020 Andaman and Nicobar Islands 3754 NaN 33.0
4 16-05-2020 Andaman and Nicobar Islands 6677 NaN 33.0
In [31]:
data5.columns
Out[31]:
Index(['Date', 'State', 'TotalSamples', 'Negative', 'Positive'], dtype='object')
In [32]:
data6=data5.drop(['Date'],axis='columns')  
data6.head()
Out[32]:
State TotalSamples Negative Positive
0 Andaman and Nicobar Islands 1403 1210 12.0
1 Andaman and Nicobar Islands 2679 NaN 27.0
2 Andaman and Nicobar Islands 2848 NaN 33.0
3 Andaman and Nicobar Islands 3754 NaN 33.0
4 Andaman and Nicobar Islands 6677 NaN 33.0
In [33]:
data6 = pd.pivot_table(data6, values=['TotalSamples'], index='State', aggfunc='max')
data6 = data6.sort_values(by='State',ascending= True)
data6.style.background_gradient(cmap='Wistia')
Out[33]:
TotalSamples
State
Andaman and Nicobar Islands 17852
Andhra Pradesh 1115635
Arunachal Pradesh 29232
Assam 508973
Bihar 291654
Chandigarh 9253
Chhattisgarh 196150
Dadra and Nagar Haveli and Daman and Diu 35234
Delhi 724148
Goa 84945
Gujarat 441692
Haryana 342404
Himachal Pradesh 94720
Jammu and Kashmir 429787
Jharkhand 167650
Karnataka 779209
Kerala 307219
Ladakh 15067
Madhya Pradesh 449680
Maharashtra 1257564
Manipur 59740
Meghalaya 22156
Mizoram 15523
Nagaland 22697
Odisha 321443
Puducherry 24485
Punjab 369425
Rajasthan 987272
Sikkim 12012
Tamil Nadu 1491783
Telangana 140755
Tripura 77439
Uttar Pradesh 988960
Uttarakhand 86458
West Bengal 583328
In [34]:
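plt.rcParams["figure.figsize"] = (20,10)
# One bar per state: a histogram would bin the sample counts instead of showing them by state
plt.bar(data6.index, data6['TotalSamples'], width=0.8)
plt.xticks(rotation=90)
plt.xlabel("State")
plt.ylabel("TotalSamples")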
Out[34]:
Text(0, 0.5, 'TotalSamples')

Importing and Reading the Dataset

  • Indian Coordinates - latitude and longitude values of all states
  • Covid19 Analysis - columns: Name of State / UT, Total Confirmed cases, Active, Cured/Discharged/Migrated, Deaths
In [35]:
ic = pd.read_csv(r"D:\DATA SCIENCE\Covid19 analysis\557629_1323860_bundle_archive\datasets_555917_1128483_Indian Coordinates.csv")
ic.head()
Out[35]:
Name of State / UT Latitude Longitude Unnamed: 3
0 Andaman And Nicobar 11.667026 92.735983 NaN
1 Andhra Pradesh 14.750429 78.570026 NaN
2 Arunachal Pradesh 27.100399 93.616601 NaN
3 Assam 26.749981 94.216667 NaN
4 Bihar 25.785414 87.479973 NaN
In [36]:
cc = pd.read_csv(r"D:\DATA SCIENCE\Covid19 analysis\557629_1323860_bundle_archive\datasets_555917_1128483_Covid cases in India.csv")
cc.head()
Out[36]:
S. No. Name of State / UT Total Confirmed cases Active Cured/Discharged/Migrated Deaths
0 1 Andhra Pradesh 1583 1062 488 33
1 2 Andaman and Nicobar Islands 33 1 32 0
2 3 Arunachal Pradesh 1 0 1 0
3 4 Assam 43 9 33 1
4 5 Bihar 517 396 117 4
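
The two files spell some names differently (for example `Andaman And Nicobar` in the coordinates file and `Andaman and Nicobar Islands` in the cases file), so an inner join on `Name of State / UT` may silently drop rows. The following is a minimal sketch of how unmatched names could be detected, using a few rows from the two outputs above; the frame names `coords` and `cases` are introduced here for illustration.

```python
import pandas as pd

# A few rows from Out[35] and Out[36]
coords = pd.DataFrame({
    "Name of State / UT": ["Andaman And Nicobar", "Assam", "Bihar"],
    "Latitude": [11.667026, 26.749981, 25.785414],
    "Longitude": [92.735983, 94.216667, 87.479973],
})
cases = pd.DataFrame({
    "Name of State / UT": ["Andaman and Nicobar Islands", "Assam", "Bihar"],
    "Active": [1, 9, 396],
})

# An outer join with indicator=True records which side each row came from
check = pd.merge(coords, cases, on="Name of State / UT",
                 how="outer", indicator=True)

# Rows that are not present in both files would be lost by an inner join
unmatched = check[check["_merge"] != "both"]
unmatched_names = sorted(unmatched["Name of State / UT"])
print(unmatched[["Name of State / UT", "_merge"]])
```

Once the unmatched names are known, they can be aligned with `replace` on one of the frames before the final merge.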

Map Plotting

In [37]:
import folium  # this import was missing and caused a NameError in the original run

df_full = pd.merge(ic,cc,on='Name of State / UT')
# Named india_map so that the built-in map() is not shadowed
india_map = folium.Map(location=[20, 80], zoom_start=1,tiles='Stamen Toner')

# One circle per state, with a radius proportional to the number of active cases
for lat, lon, value, name in zip(df_full['Latitude'], df_full['Longitude'], df_full['Active'], df_full['Name of State / UT']):
    folium.CircleMarker([lat, lon],
                        radius=value*0.002,
                        popup = ('<strong>State</strong>: ' + str(name).capitalize() + '<br>'
                                '<strong>Active Cases</strong>: ' + str(value) + '<br>'),
                        color='red',
                        fill_color='red',
                        fill_opacity=0.3 ).add_to(india_map)
india_map
In [38]:
%%HTML
<div class='tableauPlaceholder' id='viz1588697469217' style='position: relative'><noscript><a href='#'><img alt=' ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;44&#47;44JW7JNG3&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='path' value='shared&#47;44JW7JNG3' /> <param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;44&#47;44JW7JNG3&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='filter' value='publish=yes' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1588697469217');                    var vizElement = divElement.getElementsByTagName('object')[0];                    if ( divElement.offsetWidth > 800 ) { vizElement.style.minWidth='420px';vizElement.style.maxWidth='650px';vizElement.style.width='100%';vizElement.style.minHeight='587px';vizElement.style.maxHeight='887px';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';} else if ( divElement.offsetWidth > 500 ) { vizElement.style.minWidth='420px';vizElement.style.maxWidth='650px';vizElement.style.width='100%';vizElement.style.minHeight='587px';vizElement.style.maxHeight='887px';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';} else { vizElement.style.width='100%';vizElement.style.height='1477px';}                     var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, 
vizElement);                </script>

Bar Plotting depending on the Map Data

In [39]:
f, ax = plt.subplots(figsize=(12, 8))
data8 = df_full[['Name of State / UT','Total Confirmed cases','Cured/Discharged/Migrated','Deaths']]
# sort_values returns a new frame, which avoids modifying a slice of df_full in place
data8 = data8.sort_values('Total Confirmed cases',ascending=False)
sns.set_color_codes("pastel")
sns.barplot(x="Total Confirmed cases", y="Name of State / UT", data=data8,
            label="Total", color="r")

sns.set_color_codes("muted")
sns.barplot(x="Cured/Discharged/Migrated", y="Name of State / UT", data=data8,
            label="Recovered", color="g")


# Add a legend and informative axis label
ax.legend(ncol=2, loc="lower right", frameon=True)
ax.set(xlim=(0, 10000), ylabel="",
       xlabel="Cases")
sns.despine(left=True, bottom=True)

How Are Coronavirus Cases Rising?

In [40]:
import pandas as pd
data9 = pd.read_excel(r'D:\DATA SCIENCE\Covid19 analysis\557629_1323860_bundle_archive\datasets_555917_1128483_per_day_cases.xlsx',sheet_name='India')
data9.head()
Out[40]:
Date Total Cases New Cases Days after surpassing 100 cases
0 2020-01-30 1 1 NaN
1 2020-01-31 1 0 NaN
2 2020-02-01 1 0 NaN
3 2020-02-02 2 1 NaN
4 2020-02-03 3 1 NaN
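
`Total Cases` is a running total, so `New Cases` should equal its day-to-day difference. A minimal sketch of that relationship, with illustrative values chosen so that the two columns agree; the name `derived` is introduced here for illustration.

```python
import pandas as pd

# Illustrative values chosen so that the running total and daily counts agree
daily = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-30", "2020-01-31", "2020-02-01",
                            "2020-02-02", "2020-02-03"]),
    "Total Cases": [1, 1, 1, 2, 3],
    "New Cases": [1, 0, 0, 1, 1],
})

# diff() gives the day-to-day change; the first day has no predecessor,
# so its value is filled with the first running total
derived = daily["Total Cases"].diff().fillna(daily["Total Cases"]).astype(int)

consistent = bool((derived == daily["New Cases"]).all())
print(derived.tolist(), consistent)
```

The same check on the full sheet would flag days on which the two columns disagree, for example after a retrospective correction.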
In [41]:
# Rise in COVID-19 cases in India
fig = go.Figure()
fig.add_trace(go.Scatter(x=data9['Date'], y=data9['Total Cases'],
                    mode='lines+markers',name='Total Cases'))

fig.add_trace(go.Scatter(x=data9['Date'], y=data9['New Cases'], 
                mode='lines',name='New Cases'))

        
    
fig.update_layout(title_text='Trend of Coronavirus Cases in India(Cumulative cases)',plot_bgcolor='rgb(250, 242, 242)')

fig.show()


# New COVID-19 cases reported daily in India

import plotly.express as px
fig = px.bar(data9, x="Date", y="New Cases", barmode='group',
             height=400)
fig.update_layout(title_text='New Coronavirus Cases in India per day',plot_bgcolor='rgb(250, 242, 242)')

fig.show()
In [42]:
# 'Total Cases' is numeric, so the bars are coloured with a continuous colour scale
fig = px.bar(data9, x="Date", y="Total Cases", color='Total Cases', orientation='v', height=600,
             title='Confirmed Cases in India', color_continuous_scale = px.colors.cyclical.mygbm)

fig.update_layout(plot_bgcolor='rgb(250, 242, 242)')
fig.show()
In [43]:
from plotly.subplots import make_subplots

fig = make_subplots(
    rows=2, cols=2,
    specs=[[{}, {}],
           [{"colspan": 2}, None]],
    #subplot_titles=("India")
)
fig.add_trace(go.Scatter(x=data9['Date'], y=data9['Total Cases'],
                    marker=dict(color=data9['Total Cases'], coloraxis="coloraxis")),
              2, 1)

fig.update_layout(coloraxis=dict(colorscale='Bluered_r'), showlegend=False,title_text="Trend of Coronavirus cases")

fig.update_layout(plot_bgcolor='rgb(250, 242, 242)')
fig.show()

Importing and Reading the Dataset

IndividualDetails.csv

In [44]:
data10 = pd.read_csv(r"D:\DATA SCIENCE\Covid19 analysis\557629_1323860_bundle_archive\IndividualDetails.csv")
data10.head()
Out[44]:
id government_id diagnosed_date age gender detected_city detected_district detected_state nationality current_status status_change_date notes
0 0 KL-TS-P1 30-01-2020 20 F Thrissur Thrissur Kerala India Recovered 14-02-2020 Travelled from Wuhan
1 1 KL-AL-P1 02-02-2020 NaN NaN Alappuzha Alappuzha Kerala India Recovered 14-02-2020 Travelled from Wuhan
2 2 KL-KS-P1 03-02-2020 NaN NaN Kasaragod Kasaragod Kerala India Recovered 14-02-2020 Travelled from Wuhan
3 3 DL-P1 02-03-2020 45 M East Delhi (Mayur Vihar) East Delhi Delhi India Recovered 15-03-2020 Travelled from Austria, Italy
4 4 TS-P1 02-03-2020 24 M Hyderabad Hyderabad Telangana India Recovered 02-03-2020 Travelled from Dubai to Bangalore on 20th Feb,...
In [45]:
sns.countplot(y='gender', data=data10)
plt.show()
In [46]:
# Keep data10 intact and store the per-state counts under a separate name
state_counts = data10.detected_state.value_counts()
sns.barplot(y = state_counts.index, x = state_counts.values, orient='h');

Stay Home Stay Safe